ABSTRACT



Let $M = \langle Q, q_0, \delta, F \rangle$ be a finite automaton over the alphabet
$\Sigma$. A state $q \in Q$ is a dead state iff $q \notin F$ and
$\delta(q, \alpha) = q$ for all $\alpha \in \Sigma^*$.
Let $\bar{\ }$ be a mapping from $\Sigma^*$ onto the non-negative
integers defined by $\bar{\Lambda} = 0$ ($\Lambda$ is the empty string), $\overline{\alpha x} = k\bar{\alpha} + x$, $x \in \Sigma$.
Define $\pi_A(n) = \#\{\alpha \in A : \bar{\alpha} \le n\}$ and $\mu_A(n) = \#\{\alpha \in A : \ell(\alpha) = n\}$. If $A$ is
regular let $M_A$ be the minimal automaton recognizing $A$. Each automaton
$M$ induces a Markov process obtained by considering the inputs to be
generated by independent rolls of a $k$-sided fair die. Let $p(M)$ represent
the probability of being in a final state. Let $p(A) = p(M_A)$. The
following are proved: 1) …; 2) $\pi_A(n)/n \to \theta$, $A$ regular
$\Rightarrow p(A) = \theta$; 3) $p(A) = 0 \Rightarrow M_A$ has the dead state as the only
absorbing state; 4) $\forall \varepsilon > 0$ $\exists$ a regular set $A$ such that $M_A$ has a dead state
and $\mu_A(n+1)/\mu_A(n) \to \theta$ where $k - \varepsilon < \theta < k$; 5) If $p(A) = 0$, then
$\mu_A(n+1)/\mu_A(n)$ cannot converge to $k$. With $k = 2$, these results prove that there is no
regular set $A$ such that … and … $= 0$. Hence
there is no 1-1 mapping from the set of all trees representing expressions
involving a binary + and a variable x into $\{1,2\}^*$ which preserves
the number of +'s and x's and such that the set of tree images
is a regular set.
I. INTRODUCTION



Although the theory of finite automata (or sequential machines) is 
itself only about fifteen years old, much effort has already been devoted 
to exactly what automata can and cannot do. And deservedly so, for 
several reasons: First, and perhaps the most "pure" of the reasons, is
that an automaton, by virtue of its definition, has interesting mathematical
properties. Beyond this theoretical consideration, however, lie
some more practical reasons. Generally speaking, an automaton is the
simplest model of a digital computer. But since the number of "states"
in a computer may be enormous, direct application of results
of automata theory to an entire computer is certainly not practicable.
Application of automata theory concepts and techniques is
restricted to systems with relatively small numbers of states (at most 
in the thousands). There are, however, practical situations in which 
this limitation in size is met. The sequential circuits which are the 
basic component of computers are specified by the input-output trans- 
formation which they must realize. The circuit’s operation is described 
in terms of states and since it must be humanly manageable, the number of 
states cannot be too large. In fact, any computing device, organized as 
an iterative array, can be separated into smaller components which will 
be circuits with a small number of states. Automata theory also has
applications in flow-charting and program equivalence.

Many of the results achieved concerning regularity or non-regularity 
of sets have been done in the context of equivalence relations of finite 
index and a fundamental lemma concerning the way certain input sequences 
can be separated. Only three years ago Marvin Minsky and Seymour Papert 
[10] developed a set of criteria for non-regularity based on a limiting
quantity which seems intuitively to represent the portion or percentage 
of strings that are in the regular set. In this paper we define for 
regular sets, in terms of natural Markov processes, another quantity, 
p(A), which, generally speaking, again represents a kind of percentage 
of strings in the regular set and show that, when Minsky's limits exist, 
these two quantities agree. From the quantity developed in this paper, 
however, more information can be gathered concerning the machine which 
recognizes the regular set (and hence information about the regular set 
itself) than can be gathered from Minsky's criteria. Also described is 
a special machine which gives an upper bound for the growth rate of sets 
recognizable by a certain type of automaton. Finally it is shown that 
the set of well-formed trees written as strings is not a regular set.






II. GENERAL DISCUSSION AND DEFINITION 



We will give here only as much basic automata theory as is necessary for 

the completion and understanding of this paper. More detail and further 

explanation can be found in any number of texts, such as Harrison [6], 

Rabin and Scott [11], or, again, Rabin [2]. 

Before proceeding into the definition of an automaton itself, we 

shall concern ourselves first with the type of input an automaton receives. 

Definition 2.1 An alphabet $\Sigma$ is a finite set of symbols.

Definition 2.2 A tape $\alpha$ is any finite sequence of symbols from the
alphabet. Included also as a tape is the empty tape, denoted $\Lambda$,
which is the tape with no symbols. Defined on the set of tapes is the
operation of concatenation, or juxtaposition, i.e., if $\alpha$ is a tape
and $\beta$ is a tape, then $\alpha\beta$ is the tape formed by "concatenating"
$\alpha$ and $\beta$. If $R$ is a set of tapes and $S$ is a set of tapes then
$R.S = \{\alpha\beta : \alpha \in R \text{ and } \beta \in S\}$.

Definition 2.3 $R^0 = \{\Lambda\}$, $R^{n+1} = R^n . R$

Definition 2.4 $R^* = \bigcup_{i=0}^{\infty} R^i$
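The set operations of Definitions 2.2 through 2.4 can be sketched in a few lines. This is only an illustration: tapes are modeled as Python strings, the names `concat`, `power` and `star_up_to` are hypothetical, and since $R^*$ is generally infinite the last function enumerates only its members up to a chosen length.

```python
from typing import Set

EMPTY = ""  # the empty tape, written Lambda in the text


def concat(r: Set[str], s: Set[str]) -> Set[str]:
    """R.S: every tape of R followed by every tape of S."""
    return {a + b for a in r for b in s}


def power(r: Set[str], n: int) -> Set[str]:
    """R^n, with R^0 containing only the empty tape."""
    result = {EMPTY}
    for _ in range(n):
        result = concat(result, r)
    return result


def star_up_to(r: Set[str], max_len: int) -> Set[str]:
    """All members of R* whose length does not exceed max_len."""
    found = {EMPTY}
    frontier = {EMPTY}
    while frontier:
        # extend the newest tapes by one factor from R, keep only unseen short ones
        frontier = {t for t in concat(frontier, r) if len(t) <= max_len} - found
        found |= frontier
    return found
```

For the two-symbol alphabet, `power({"1", "2"}, 2)` gives the four tapes of length two, and `star_up_to({"1", "2"}, 3)` gives the 15 tapes of length at most three.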



Thus we see that the set of all tapes from the alphabet $\Sigma$ is $\Sigma^*$.
With this operation of concatenation defined as above, $\langle \Sigma^*, \cdot \rangle$
is a monoid, and in fact, $\Sigma^*$ is the free monoid generated by $\Sigma$.

For this paper we shall let $\Sigma = \{1, 2\}$, although the results are
true for any finite alphabet $\{1, 2, 3, \ldots, k\}$. The prime motivation
for this particular $\Sigma$ is that it is the simplest case having all the
properties of the general case. The additional fact that most real-life
machines are binary oriented is another consideration in choosing
the two element alphabet. Then $\Sigma^*$ is the collection of all possible
sequences of 1's and 2's, including the empty string $\Lambda$. Here the
reader must be careful not to associate any numerical significance to
the symbol "1" or the symbol "2". These symbols are merely inputs to
the machine which we now shall define.

Definition 2.5 A (finite) automaton over the alphabet $\Sigma$ is a
quadruplet $M = \langle Q, q_0, \delta, F \rangle$ where $Q$ is a finite non-empty set (the
set of "states"), $q_0$ is an element of $Q$ (the initial state), $\delta$ is a
mapping of $Q \times \Sigma$ into $Q$ (the state transition function), and $F$ is a
subset of $Q$ (the set of "final" or "accept" states).

The state transition function $\delta$ can be extended from $Q \times \Sigma$ to
$Q \times \Sigma^*$ in a very natural way by recursive definition as follows:

$\delta(q, \Lambda) = q$ for all $q \in Q$

$\delta(q, \alpha x) = \delta(\delta(q, \alpha), x)$ for $x \in \Sigma$, $q \in Q$

Definition 2.6 The set of tapes accepted or defined by the automaton
$M$, denoted $T(M)$, is the collection of all tapes $\alpha$ in $\Sigma^*$ such that
$\delta(q_0, \alpha)$ is an element of $F$.

Definition 2.7 A set R is called regular (or recognizable) if and 
only if there exists a finite automaton M such that R = T(M). 

The preceding definition merely says that a set $R$ is regular only
if there is an automaton $M$ which accepts all tapes in $R$ and rejects all
others, i.e., $\delta(q_0, \alpha) \in F$ iff $\alpha \in R$.
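Definitions 2.5 through 2.7 translate directly into code. The sketch below is illustrative only: the dictionary representation of the transition function, the function names, and the two-state machine `ENDS_IN_2` are assumptions made for the example and do not appear in the text.

```python
from typing import Dict, Set, Tuple

Delta = Dict[Tuple[str, str], str]  # (state, symbol) -> state


def extend(delta: Delta, state: str, tape: str) -> str:
    """Extended transition function: feed the tape one symbol at a time."""
    for symbol in tape:  # the empty tape leaves the state unchanged
        state = delta[(state, symbol)]
    return state


def accepts(delta: Delta, q0: str, final: Set[str], tape: str) -> bool:
    """True iff the tape belongs to T(M), i.e. delta(q0, tape) lies in F."""
    return extend(delta, q0, tape) in final


# Hypothetical two-state machine accepting the tapes that end in "2".
ENDS_IN_2: Delta = {
    ("a", "1"): "a", ("a", "2"): "b",
    ("b", "1"): "a", ("b", "2"): "b",
}
```

With initial state `"a"` and final set `{"b"}`, the tape `"112"` is accepted while the empty tape and `"21"` are rejected.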

Recall now that $\Sigma$ has been defined as the set $\{1, 2\}$. In order
to enable us to lend some numerical bearing to this research we define
the following one-to-one mapping from $\{1,2\}^*$ onto the non-negative
integers.

$\bar{\ } : \{1,2\}^* \to$ Non-negative integers

By establishing a sort of alphabetical hierarchy between 1 and 2 in
$\Sigma$ we can define the "bar" mapping as:



$\bar{\Lambda} = 0$
$\bar{1} = 1$
$\bar{2} = 2$
$\overline{11} = 3$
$\overline{12} = 4$
$\overline{21} = 5$
$\overline{22} = 6$
$\overline{111} = 7$



Thus we now have associated with each member of $\{1,2\}^*$ a non-negative
integer, and vice-versa. We call the "bar" mapping the
encoding mapping and its inverse the decoding mapping.
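The encoding and decoding mappings can be sketched as follows. The recurrence used here (empty tape to 0, and appending a symbol $x$ multiplies the code by $k$ and adds $x$) is an assumption inferred from the listed values 0 through 7; the function names are hypothetical, and symbols are taken to be the single digits 1 through $k$.

```python
def encode(tape: str, k: int = 2) -> int:
    """Encoding ("bar") mapping: bijective base-k value of a tape over 1..k."""
    value = 0  # the empty tape is sent to 0
    for ch in tape:
        value = k * value + int(ch)
    return value


def decode(n: int, k: int = 2) -> str:
    """Decoding mapping: the unique tape over 1..k whose code is n."""
    digits = []
    while n > 0:
        d = n % k
        if d == 0:  # digits run from 1 to k, so a remainder of 0 means digit k
            d = k
        digits.append(str(d))
        n = (n - d) // k
    return "".join(reversed(digits))
```

Under this assumption `encode("12")` is 4 and `decode(7)` is `"111"`, matching the table, and `decode` inverts `encode` on every non-negative integer.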

As the last general definition, we define a natural relation in
the monoid $\langle \Sigma^*, \cdot \rangle$.

Definition 2.8 Let $\alpha, \beta \in \Sigma^*$. Then

$\alpha \le \beta \iff \exists\, \gamma \in \Sigma^*$ …
III. LIMIT THEOREMS



Let $A$ be a subset of $\{1,2\}^*$. The question here is what kind of
automaton could "recognize" $A$ in the sense of definition 2.7? Since for
any regular set $A$ there can be any number of automata $M$ such that $A = T(M)$,
we shall concern ourselves only with the unique minimal automaton
recognizing $A$ and we will call this automaton $M_A$.

Definition 3.1 $\pi_A(n)$ = the cardinality of the set $\{\alpha \in A : \bar{\alpha} \le n\}$.

Definition 3.2 Let $\alpha \in \{1,2\}^*$. The length of $\alpha$, $\ell(\alpha)$, is the
number of digits in $\alpha$.

Then we have … Furthermore …

Definition 3.3 $\mu_A(n)$ = the cardinality of the set $\{\alpha \in A : \ell(\alpha) = n\}$.

Definition 3.4 A dead state of $M_A$ is a state $q \in Q$ such that
$\delta(q, \alpha) \in F$ is satisfied by no $\alpha \in \{1,2\}^*$. Since there is
only one dead state in the minimal automaton, this is equivalent to
saying $\delta(q, \alpha) = q$ for all $\alpha \in \{1,2\}^*$.

The next three theorems are generalizations and small extensions of
the work done by Minsky and Papert in [10], and are stated without proof
since the proofs are, with only minor modifications for theorems 3.5 and
3.8, essentially identical with those in [10]. The first two theorems
are concerned with the consequences of $M_A$ having a dead state, and the
last theorem indicates what must happen if $M_A$ has no dead state.

Theorem 3.5 Let $M = M_A$ and suppose:

(a) $\delta(q_0, \alpha)$ is dead

(b) ^ - qF t % 

(c) lim 1T a (n ) _ q 

n *" TXfr) 
Conclusion: $\theta = 1$



From theorem 3.5 we get immediately

Corollary 3.6 If … for all real $\lambda$ and if
…, then $A$ cannot be a regular set whose minimal automaton
has a dead state.



In some cases the sequence … fails to converge, in which
case we can still sometimes use the following theorem.

Theorem 3.7 Let $\alpha_r$ be the $r$th member of $A$ in order of magnitude
under the encoding mapping. That is, there are $r-1$ members of $A$ whose
image under the encoding mapping is less than $\bar{\alpha}_r$. Then if

$$\lim_{r \to \infty} \frac{\bar{\alpha}_r}{\bar{\alpha}_{r+1}} = 0$$

$A$ cannot be a regular set whose minimal automaton has a dead state.






Theorem 3.8 If $A$ is regular and $M_A$ has no dead state, then
… where $N$ is the number of states of $M_A$. Thus the density of $A$ cannot
converge to zero.

Combining these results yields the following criterion. 

Criterion: To prove that a set, $A$, is not regular, it is
sufficient to verify condition 1 and condition 2 or 3.

Condition 1: $\dfrac{\pi_A(n)}{n} \to 0$ as $n \to \infty$

Condition 2: … $\to \theta(\lambda)$ as $n \to \infty$, and $\theta(\lambda) \ne 1$ if $\lambda \ne 1$

Condition 3: $\dfrac{\bar{\alpha}_n}{\bar{\alpha}_{n+1}} \to 0$ as $n \to \infty$



If $A$ is regular, then by theorem 3.8 $M_A$ has a dead state, but by corollary
3.6 or theorem 3.7 it has none. Thus we are led quickly to a contradiction.
IV. THE PROBABILITY OF ACCEPTANCE



When using limit theorems such as the ones developed in section three
we run into several problems. First, of course, is the nature of the
limiting process itself. The limit may or may not exist, and, even if we
can establish its existence, we may not be able to determine its value.
Secondly, unless $\pi_A(n)$ is a relatively easy sequence to recognize, we
cannot begin to evaluate the needed sequences in the theorems. If
$\lim_{n \to \infty} \pi_A(n)/n$ exists, it can be loosely interpreted as the percentage
of strings of $\{1,2\}^*$ which are in the set in question, namely $A$, and
hence as some sort of probability that a string is in $A$. If $A$ is regular
this reduces to a probability that a string is accepted. The above
interpretation is all on an intuitive basis and will not be subjected to any
rigorous analysis. In this section an algorithm is given which defines
another quantity which, again loosely, can be interpreted as the
probability that a string is accepted.

If we assume that the set $A$ is regular, then $M_A$ has a finite number
of states. Consider the transition function of $M_A$, written as a
transition matrix, i.e., if the machine is in state $q_i$, then with
probability 1/2 we apply to $q_i$ an input of a "1", and with probability 1/2
we apply an input of a "2". This is not to say that the machine is no
longer working deterministically, but merely that the input is generated
by a sequence of Bernoulli trials with the probability of a "1" = the
probability of a "2" = 1/2. If we were dealing with a $k$-symbol alphabet
$\{1, 2, \ldots, k\}$, then the probabilities would have been $1/k$. By
evaluating $\delta(q_0, 1)$ and $\delta(q_0, 2)$ we can establish which states are
accessible from $q_0$ with an input of length one. Assume $M_A$ has $N$ states.
Then by evaluating $\delta(q_i, 1)$ and $\delta(q_i, 2)$ for $i = 0, 1, 2, \ldots, N-1$ and
arranging the states of $M_A$ in a square pattern, viz., with rows and columns
both labeled $q_{N-1}, q_{N-2}, \ldots, q_0$,
and letting the $q_i q_j$th entry in this pattern be the "probability" that
$\delta(q_i, \alpha) = q_j$ given that the length of $\alpha$ is one, we get a matrix
of values that begins to look suspiciously like a Markov chain. Upon
further inspection, we see that all entries are either 0 ($\delta(q_i, \alpha) = q_j$
for neither $\alpha = 1$ nor $\alpha = 2$), 1/2 ($\delta(q_i, \alpha) = q_j$ for either
$\alpha = 1$ or $\alpha = 2$), or 1 ($\delta(q_i, \alpha) = q_j$ for both $\alpha = 1$ and $\alpha = 2$).
Finally it is noted that each row sums to one, i.e., $\sum_j m_{ij} = 1$ where
$m_{ij}$ is the $q_i q_j$th entry.

Thus we do indeed have a Markov chain. Let us call this matrix M, since 
it describes somewhat the original automaton. Clearly this matrix does 
not define the automaton since the transition function can only be inter- 
preted generally from this matrix and we have no information whatsoever 
on which states are final states. However, we can glean some information 
from this approach . 
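The construction of the matrix from the transition function can be sketched as follows. This is a minimal illustration, not the author's procedure verbatim: the function name, the dictionary form of $\delta$, and the three-state machine `DELTA3` are assumptions made for the example. Exact fractions are used so that the entries 0, 1/2 and 1 appear without rounding.

```python
from fractions import Fraction
from typing import Dict, List, Sequence, Tuple


def transition_matrix(states: Sequence[str],
                      delta: Dict[Tuple[str, str], str],
                      alphabet: Sequence[str] = ("1", "2")) -> List[List[Fraction]]:
    """Entry [i][j] is the probability of moving from states[i] to states[j]
    when the next input symbol is drawn uniformly from the alphabet."""
    index = {q: i for i, q in enumerate(states)}
    p = Fraction(1, len(alphabet))
    m = [[Fraction(0)] * len(states) for _ in states]
    for q in states:
        for x in alphabet:
            m[index[q]][index[delta[(q, x)]]] += p
    return m


# Hypothetical three-state machine: "1" advances, "2" resets, q2 is a dead state.
DELTA3 = {
    ("q0", "1"): "q1", ("q0", "2"): "q0",
    ("q1", "1"): "q2", ("q1", "2"): "q0",
    ("q2", "1"): "q2", ("q2", "2"): "q2",
}
# Rows and columns ordered with q0 last, as the text suggests.
M3 = transition_matrix(["q2", "q1", "q0"], DELTA3)
```

As noted in the text, the matrix alone forgets which symbol caused each move and which states are final; only the row-stochastic structure survives.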

From Feller [3], we know that $M$ can be divided, in a unique manner,
into closed sets, $C_1, C_2, \ldots, C_k$, such that from any state of a given
set all states of that set, and no other, can be reached. We shall use,
without ambiguity, the notation $C_i$ to represent both a block matrix in
$M$ and the set of states making up that block matrix. $M$ can now be
rewritten:
Since each $C_i$ can be treated independently as a Markov chain (all entries
to the right and left of each individual $C_i$ are zero entries), and each
$C_i$ is a closed set, then each $C_i$ is ergodic and the stationary (ergodic)
probabilities can be calculated. These stationary probabilities are
merely the probabilities of being in the various states of $C_i$ after a
large number of steps given that the process was already in a state of
$C_i$. The results of the ergodic theorem [5] make these probabilities
independent of the starting state in $C_k$, i.e., if $q_a \in C_k$, and $p_a$ is the
stationary probability for state $q_a$, then $p_a = \lim_{n \to \infty} (C_k^n)_{ia}$ for all $i$,
where $(C_k^n)_{ia}$ is the $ia$th entry of the matrix $C_k^n$. In general, $M$ will
also contain transient states, i.e., states not in any $C_i$, from which
states of the closed sets can be reached, but not vice-versa. These
transient states are carriers, taking the process from its beginning
to one of the closed sets.
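One way to find the closed sets and the transient states is by reachability. The sketch below is an assumption about how this step could be mechanized (the text only cites Feller for the decomposition): a state is treated as non-transient when every state reachable from it can reach it back, and its closed set is then exactly the set of states reachable from it. The five-state matrix `CHAIN` is hypothetical.

```python
from typing import FrozenSet, List, Sequence


def closed_sets(m: Sequence[Sequence[float]]) -> List[FrozenSet[int]]:
    """Return the closed sets of a transition matrix as sets of state indices."""
    n = len(m)
    # reach[i] starts with i and its one-step successors ...
    reach = [{j for j in range(n) if m[i][j] != 0} | {i} for i in range(n)]
    changed = True
    while changed:  # ... and is enlarged until it is the full reachable set
        changed = False
        for i in range(n):
            new = set().union(*(reach[j] for j in reach[i]))
            if new != reach[i]:
                reach[i] = new
                changed = True
    found: List[FrozenSet[int]] = []
    for i in range(n):
        # i is non-transient iff everything reachable from i leads back to i
        if all(i in reach[j] for j in reach[i]):
            cls = frozenset(reach[i])
            if cls not in found:
                found.append(cls)
    return found


# Hypothetical chain: states 0 and 1 form one closed set, state 2 is absorbing,
# states 3 and 4 are transient.
CHAIN = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5, 0.0],
]
```

States belonging to none of the returned sets are the transient "carriers" described above.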



Let us look more closely at these closed sets $C_j$ within the context
of the entire Markov chain $M$. Since each $C_j$ is closed, i.e., we can
never leave $C_j$ once we are in a state contained in $C_j$, each $C_i$, $i = 1,
2, \ldots, k$, taken as a whole is an absorbing state in the original Markov
chain $M$. This means that if the process begins in some transient state
and travels eventually to a state in $C_j$, then the process has been
"absorbed" into $C_j$, never again to leave. This may be better shown if
we let $c_i$ be the number of states in $C_i$, $i = 1, 2, \ldots, k$, and associate
with $M$ a matrix $M^\circ$.

The associated matrix $M^\circ$ is the original matrix $M$ with each $C_j$ replaced
by a $c_j \times c_j$ identity matrix $I_{c_j}$. If we let $r = c_1 + c_2 + \ldots + c_k$ and
let $s = N - r$, then $M^\circ$ has the canonical form

$$M^\circ = \begin{pmatrix} I & 0 \\ R & Q \end{pmatrix}$$

where $I$ is an $r \times r$ identity matrix, 0 is an $r \times s$ zero matrix, $R$ is an
$s \times r$ matrix, and $Q$ is an $s \times s$ matrix. The canonical form of $M^\circ$ is
recognized more readily as an absorbing Markov chain.
Now $(M^\circ)^n$ gives the probabilities of being in the various states
starting from various states after $n$ steps. From Kemeny and Snell [7],
we know that $(M^\circ)^n$ can be written

$$(M^\circ)^n = \begin{pmatrix} I & 0 \\ R_n & Q^n \end{pmatrix}$$

where $R_n$ is merely the $s \times r$ matrix of $(M^\circ)^n$ after we have blocked off
$I$, 0, $Q^n$ in the manner shown. (Note that $R_n$ is not necessarily $R^n$.)
This form shows that the entries in $Q^n$ give the probabilities of being
in each non-absorbing state after $n$ steps for each possible non-absorbing
starting state. By Kemeny [7], we know that the probability that the
process will be absorbed is one. Therefore each entry in $Q^n$ must approach
zero as $n$ approaches infinity, which says that $Q^n \to 0$ as $n \to \infty$.
(After zero steps the process is in the same non-absorbing state in which
it started. Hence $Q^0 = I$.) But $Q^n \to 0$ is a sufficient condition for
$I - Q$ to be non-singular [8]. Let $K = (I-Q)^{-1}$. We call $K$ the fundamental
matrix for the given absorbing chain.
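The fundamental matrix and the product $KR$ can be computed exactly with rational arithmetic. The sketch below is illustrative: the helper names are hypothetical, the inverse is obtained by plain Gauss-Jordan elimination, and the blocks `Q` and `R` describe an assumed chain with two transient states and three absorbing states.

```python
from fractions import Fraction as Fr
from typing import List

Matrix = List[List[Fr]]


def inverse(a: Matrix) -> Matrix:
    """Exact inverse of a non-singular square matrix by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [Fr(int(i == j)) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fundamental(q: Matrix) -> Matrix:
    """K = (I - Q)^(-1) for the transient block Q."""
    n = len(q)
    return inverse([[Fr(int(i == j)) - q[i][j] for j in range(n)] for i in range(n)])


# Assumed example: the second transient state moves to the first with probability 1/2.
Q = [[Fr(0), Fr(0)], [Fr(1, 2), Fr(0)]]
R = [[Fr(0), Fr(1, 2), Fr(1, 2)], [Fr(0), Fr(0), Fr(1, 2)]]
K = fundamental(Q)
B = matmul(K, R)  # row s, column t: probability of absorption in t starting from s
```

Each row of `B` sums to one, which reflects the statement that the process is absorbed with probability one.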

Following closely a procedure given by Kemeny and Snell [7], we 
will now describe an algorithm which will compute a quantity which we 
call p(A) that is an indicator of the type of automaton which recognizes 
A and represents, in a sense, the probability that a string is accepted. 
Furthermore, if A is regular, p(A) will always exist. 
Given a regular set $A$ and the minimal automaton recognizing $A$, $M_A$,
construct the state transition matrix $M$ from the transition function.
For ease of computation put the $q_0$th row as the last row of the $M$
matrix. Find the closed sets, the $C_i$, in $M$. Compute the associated
matrix $M^\circ$ and put $M^\circ$ in the canonical form. Next compute $K = (I-Q)^{-1}$,
and let $B = KR$, where $Q$ and $R$ are as in the canonical form of $M^\circ$. The
entries in the bottom row of $B$ (the $q_0$ row) will give us the probabilities
of ending up in any particular absorption state. These probabilities are
the absorption probabilities. For each $C_j$ of $M$, sum the absorption
probabilities in the corresponding $I_{c_j}$ of $M^\circ$. These sums are the probabilities
of ending up in any particular closed set $C_j$. Now either by computing
$\lim_{n \to \infty} (C_j^n)_{ia}$ for some $i$ and $j$ or by solving the following system of
equations, find the stationary probability for each non-transient state.

Equations 4.1

(a) $p_j = \sum_{m=1}^{c_i} p_m (C_i)_{mj}$, $j = 1, 2, \ldots, c_i$, where $p_m$ is the stationary
probability for state $q_m$.

(b) $\sum_{m=1}^{c_i} p_m = 1$

Note that in the above system, for each $i$, there are $c_i + 1$ equations in
$c_i$ unknowns. Equation (b) is necessary since there are only $c_i - 1$
independent equations in (a). Multiplying the stationary probabilities
of $C_j$ by the sum of the absorption probabilities for that $C_j$ yields the
final probability of being in each particular state. Now sum these final
probabilities over all $q \in F$ and call this sum $p(A)$. The procedure we
have just described has been for the general case where $q_0$ is a transient
state. If $q_0$ is not a transient state, then this implies that $M$
itself is ergodic (otherwise we would not have the minimal automaton)
and the procedure reduces to finding the stationary probabilities and
summing the stationary probabilities for all $q_a \in F$, i.e., solving
equations 4.1 with $k = 1$ and $c_1 = N$, the number of states in $M_A$, and
summing the correct probabilities. Finally we condense this description
into

Algorithm 4.2 Given a regular set $A$, and the minimal automaton
recognizing $A$, $M_A$, $p(A) =$ … where … and $p_a$ is found by solving system 4.1.
At this point one may ask what if a transient state is an element
of $F$? But since $Q^n \to 0$, the probability of ending up in any transient
state is zero, and hence this state would add nothing to $p(A)$. As an
example for algorithm 4.2, we will compute $p(A)$ for the following
automaton: $M = \langle Q, q_0, \delta, F \rangle$ where



Q =s /^os- < ^i» q 2 s>q 3* q 4 } 9 F = f q l s,q 2 !,q 4/ 9 and O(q 0 »l) 

£ ( 3q » 2 ) - q^ » £ f q^ » U — q 3> £ (q^ » 2) — q 2 » £(. q 2 * ^ ^ 

<^(q 2 » 2 ) = q 2 » </(q 3 » 1 ) = %> £ (q 3 » 2 ) = q 3 s 

<^(q 4 »2) = q 3 . 



= q^ 



= q. 



£ (q 49 i) 



= q 



4 9 



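M for this automaton is

$$M = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 & 0 \end{pmatrix}$$

with rows and columns in the order $q_4, q_3, q_2, q_1, q_0$.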
Furthermore, we find that 
Finally we calculate $KR$.

Looking at the bottom row of $KR$ we find that the probability of ending
up in $C_1 = \{q_4, q_3\}$ is 0 + 1/4 = 1/4, and the probability of ending up in $C_2 = \{q_2\}$ is 3/4.
Next we compute the stationary probabilities in $C_1$ and $C_2$, and find that
the stationary probabilities for $q_2$, $q_3$ and $q_4$ are 1, 1/2, and 1/2
respectively. Therefore, the probability of ending up in $q_2$ is 1·3/4 =
3/4, in $q_3$ is 1/4·1/2 = 1/8, and in $q_4$ is 1/4·1/2 = 1/8. Summing over
the states in $F$, namely $q_2$ and $q_4$, we find that $p(A)$ = 3/4 + 1/8 = 7/8.
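The whole computation of $p(A)$ can be sketched end to end. Everything named here is an assumption for illustration: the function names are hypothetical, the closed sets are supplied by hand, and the transition probabilities in `CHAIN` were chosen to agree with the figures quoted in the worked example (absorption probabilities 1/4 and 3/4, stationary probabilities 1, 1/2, 1/2). States are indexed in the order $q_4, q_3, q_2, q_1, q_0$. Instead of forming $K$ explicitly, the sketch solves the equivalent linear system $(I - Q)h = b$ for the absorption probabilities.

```python
from fractions import Fraction as Fr


def solve(a, b):
    """Solve a x = b exactly (a square and non-singular) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def stationary(m, cls):
    """Stationary probabilities of one closed set: equations (a), with the
    last one replaced by the normalisation (b)."""
    c = len(cls)
    a = [[m[cls[i]][cls[j]] - (1 if i == j else 0) for i in range(c)] for j in range(c)]
    a[-1] = [Fr(1)] * c
    return solve(a, [Fr(0)] * (c - 1) + [Fr(1)])


def p_of_A(m, closed, start, final):
    """Sum over final states of (absorption probability) x (stationary probability)."""
    recurrent = {i for cls in closed for i in cls}
    transient = [i for i in range(len(m)) if i not in recurrent]
    total = Fr(0)
    for cls in closed:
        if start in cls:
            absorb = Fr(1)
        elif start in recurrent:
            absorb = Fr(0)
        else:
            a = [[(1 if s == t else 0) - m[s][t] for t in transient] for s in transient]
            b = [sum(m[s][t] for t in cls) for s in transient]
            absorb = solve(a, b)[transient.index(start)]
        pi = stationary(m, cls)
        total += absorb * sum(p for q, p in zip(cls, pi) if q in final)
    return total


H = Fr(1, 2)
CHAIN = [            # assumed order: q4, q3, q2, q1, q0
    [H, H, 0, 0, 0],
    [H, H, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, H, H, 0, 0],
    [0, 0, H, H, 0],
]
CLOSED = [[0, 1], [2]]   # {q4, q3} and {q2}
FINAL = {0, 2, 3}        # indices of q4, q2, q1
P_A = p_of_A(CHAIN, CLOSED, start=4, final=FINAL)
```

Under these assumptions `P_A` equals 7/8. The transient final state $q_1$ contributes nothing, as the text observes.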

From the definition and construction of $p(A)$ we obtain the following
theorem.

Theorem 4.3 If $A$ is regular and $p(A) = 0$, then $M_A$ must have a dead
state. Furthermore, the dead state is the only absorbing state in $M_A$
and the transition matrix $M$ is an absorbing Markov chain with one
absorbing state, namely the dead state.

Proof: First of all, the assumption that $p(A) = 0$ implies that the
matrix $M$ itself cannot be ergodic. If $M$ were ergodic, we could reach
every state from every other state and from Fisz [5] we would have all
stationary probabilities positive. But $p(A) = 0$ implies that the stationary
probabilities for states of $F$ are zero. Hence, we have a contradiction.
Therefore, consider the transition matrix of $M_A$ in the block form
Let us first consider each $C_k$. Again from Fisz, since each $C_k$ is a closed
set and every state in $C_k$ can be reached from every other state in $C_k$,
all stationary probabilities in each $C_k$ are positive. Since the machine
is minimal, each state is accessible, which means that the probability of
being absorbed by each $C_k$ is positive. Hence, no state in any $C_k$ is in
$F$ because otherwise $p(A)$ would be positive. This means that we have
remaining only $C_k$'s with states not in $F$ (and there must be at least one
such $C_k$ since the probability that the process is absorbed is one).
Since these $C_k$'s contain only states not in $F$ and are closed, i.e., if
$q \in C_k$, then $\delta(q, \alpha) \notin F$ for all $\alpha \in \{1,2\}^*$, each of these states
is dead. Because $M_A$ is the minimal machine, there can be at most one
dead state.

Corollary 4.4 If $A$ is regular and $p(A) = 0$, then for every
$\alpha \in \Sigma^*$ there exists $\beta \in \Sigma^*$ such that $\delta(\delta(q_0, \alpha), \beta) = q_D$, where
$q_D$ is the dead state of $M_A$.

Proof: From theorem 4.3 we know that the transition matrix of $M_A$
has the form

$$\begin{pmatrix} 1 & 0 \\ R & Q \end{pmatrix}$$

where the lone absorbing state is the dead state. We know also that the
probability that the process is absorbed is one. This means that from
any state of $M_A$ we must be able to get to the dead state. Therefore,
for any $\alpha \in \Sigma^*$, $\exists\, \beta \in \Sigma^*$ such that $\delta(\delta(q_0, \alpha), \beta) = q_D$.






V. THE AGREEMENT OF THE PROBABILITIES



Thus far we have presented two quantities, $\lim_{n \to \infty} \pi_A(n)/n$ and
$p(A)$, both of which seem intuitively to represent the probability that
a string is accepted. On the one hand $\lim_{n \to \infty} \pi_A(n)/n$ exists for some
regular and some non-regular sets, while $p(A)$ exists only and always
for regular sets. In order for the conclusions drawn from both these
quantities to be truly valid and meaningful, they should agree where
they both exist. In this section we show that this is indeed the case,
i.e., if $A$ is regular and $\lim_{n \to \infty} \pi_A(n)/n = \theta$, then $\theta = p(A)$. Before
we can show this, however, we need the following lemma.

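Lemma 5.1 If $\dfrac{\pi_A(n)}{n} \to \theta$, then $\dfrac{\mu_A(n)}{2^n} \to \theta$.

Proof: Suppose $\dfrac{\mu_A(n)}{2^n}$ doesn't converge to $\theta$. Then $\exists\, \varepsilon > 0$
such that for infinitely many $n$, $\dfrac{\mu_A(n)}{2^n} > \theta + \varepsilon$ (or $< \theta - \varepsilon$).
Let $M$ be this set of $n$'s. Now let $\alpha_n = 2^n$ (the string, not the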



since 



1- 3 ^1 ^ for all n ^ N 

" j < • Similarly, since = jTf , J /Vj 9 

I < % • Frora the hypothesis, 3 9 

j - 0 1 < ° Choose n ,> N 4 1 where N = 



number) 

|L 

<Y n 

for all n ^ N 



for all n ^ N 



max (N^, N^s) N^) and such that n £ M* Again let &(r\ - 2 * Now 




( AVw ) + Ja(v ) _ % { °tn-i ) 4 (n) 
JL Jt^L ( L (r>) 7

1 l£? r, -2)+l T-l ] 



Since 0 n-1 >N * 

°^n-i 
Therefore, for n > N + 1

> JL f 

W n " * [ °<n-, 




Thus for infinitely many n* 



% (V„ ) 



> 






jl 

3 




> 



-^6 %-hQ + Z - % 



j[\_36 + %] = 9 + % 

T/aM 



But this contradicts the hypothesis that 



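Theorem 5.2 If $A$ is regular and $\dfrac{\mu_A(n)}{2^n}$ converges to $\theta$, then $\theta = p(A)$.

Proof: First we define $\mu_a(n)$ to be the cardinality of the
set $\{\alpha \in \Sigma^* : \delta(q_0, \alpha) = q_a \text{ and } \ell(\alpha) = n\}$. Then clearly

$$\mu_A(n) = \sum_{q_a \in F} \mu_a(n)$$

If we consider the transition matrix of $M_A$ we have two cases.

Case 1. $M$ itself is ergodic. Then the probability of capture in
state $q_a$ in $n$ steps is $\dfrac{\mu_a(n)}{2^n} = (M^n)_{0a}$. Taking limits on both sides
we have, from Fisz [5], $\lim_{n \to \infty} \dfrac{\mu_a(n)}{2^n} = \lim_{n \to \infty} (M^n)_{0a} = p_a$, where $p_a$
is the stationary probability for state $q_a$. Finally,

$$\theta = \lim_{n \to \infty} \frac{\mu_A(n)}{2^n} = \lim_{n \to \infty} \sum_{q_a \in F} \frac{\mu_a(n)}{2^n} = \sum_{q_a \in F} \lim_{n \to \infty} \frac{\mu_a(n)}{2^n} = \sum_{q_a \in F} p_a = p(A)$$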
Case 2. $M$ is not ergodic. In this case, as in algorithm 4.2, we
rewrite $M$ in the form




states of $F$ which are also transient states. Assume now that $q_a \in C_a$.
Let $T$ be the set of transient states. Then the probability of capture
in state $q_a$ in $n$ steps is …

Fix $t \in C_a$ and $s \in T$ such that $(R)_{st} \ne 0$. If $(R)_{st} = 0$, then this path
adds nothing to the sum of all paths which have length $n$ and end in
state $q_a$. If $(R)_{st} = 0$ for all $s$ and $t$, then we do not have the minimal
machine, since this would mean none of the states in $C_a$ are accessible.
Therefore, there exists at least one combination $s$, $t$ such that
$(R)_{st} \ne 0$. Let $P_n$ be the probability of capture in state $q_a$ in $n$
steps via this certain path. Then
We know the following things:

0 



.>*3 / ■('«); 




-c 


for all s 


*0 . 


from Finkbeiner [4]. 


or else state q would 
s 



(II1) fcz (Q n ) 0% = 0 

Given €>0 9 



(a) Choose NL 9 for n ^ NL 

3 3 



for all 






« /-<? 

(b) Choose 9 for n 



N^ ^ f° r n 

(&2- f 



-/ 

05 



(c) Choose N ^ for n N 

I 1 



where 



'OS (ft)sjt £ 

G-roax{(M). ,(M% 



Let n -> N. 4 ML + N, + 1, 
1 ^ J 






Let 




By (c) since n-^ > N 




Then, using (b) also 




By (a), since n-(N 2 + 1) > Jl + N 3 £ N 3 




p n £ P n pc. -t 3 C i- ^ 



Where 



r- 



€ 



(i-qXM 



ht 



Since £ can be made arbitrarily small, we have 






fr, p - D -?* ’^{l (Q*l ^*\ e ’- 



And finally we arrive at 

l 



£Ortv /I a (ft) _ ^ /?«6i) 

n~^eQ n^>co Zj 

* fit F « 



=ir£ 1 £ faLt 

fceF jteQ seT 

.£££ib &1* 

/&G 

. £ £ £ «, ftl * 

$> eF feCc S€T 



--e 



(a) 



The reader is advised to compare this with algorithm 4.2 in which we
described the construction of $p(A)$.



Theorem 5.3 If $A$ is regular and $\dfrac{\pi_A(n)}{n} \to \theta$, then $\theta = p(A)$.

Proof: Apply lemma 5.1 and theorem 5.2.
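The agreement can be checked numerically on a small case. The sketch below computes the fraction of tapes of length $n$ that are accepted by propagating the uniform-input distribution through a transition matrix. The matrix `CHAIN`, the state ordering $q_4, q_3, q_2, q_1, q_0$, and the final set are assumptions chosen to be consistent with the example of section IV, for which $p(A)$ = 7/8 was obtained; the function name is hypothetical.

```python
from fractions import Fraction as Fr

H = Fr(1, 2)
CHAIN = [            # assumed order: q4, q3, q2, q1, q0
    [H, H, 0, 0, 0],
    [H, H, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, H, H, 0, 0],
    [0, 0, H, H, 0],
]
FINAL = {0, 2, 3}    # indices of q4, q2, q1
START = 4            # index of q0


def accepted_fraction(n: int) -> Fr:
    """Fraction of the 2**n tapes of length n that end in a final state."""
    size = len(CHAIN)
    dist = [Fr(0)] * size
    dist[START] = Fr(1)
    for _ in range(n):
        # one more uniformly random input symbol
        dist = [sum(dist[i] * CHAIN[i][j] for i in range(size)) for j in range(size)]
    return sum(dist[a] for a in FINAL)
```

For this particular chain the fraction is already exactly 7/8 from $n = 3$ onward, so the limit of the accepted fraction and $p(A)$ coincide, as theorem 5.2 asserts.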
VI. THE FIBONACCI MACHINE--A GROWTH RATE UPPER BOUND

For the proof of the next theorem, it becomes necessary to look at
a particular machine of $N$ states, one of which is a dead state, which
we shall call $M_N$. Described formally, $M_N = \langle Q, q_0, \delta, F \rangle$ where

$Q = \{q_0, q_1, q_2, \ldots, q_{N-1}\}$

$F = \{q_0, q_1, \ldots, q_{N-2}\}$

and the transition function is given by

$\delta(q_i, 2) = q_0$, $i = 0, 1, \ldots, N-2$

$\delta(q_i, 1) = q_{i+1}$, $i = 0, 1, \ldots, N-2$

$\delta(q_{N-1}, 1) = q_{N-1}$

$\delta(q_{N-1}, 2) = q_{N-1}$

It is obvious from the definition of the transition function that $q_{N-1}$
is the dead state and is the only absorbing state. Hence $p(M_N) = 0$.
(Here we have substituted the name of the machine for the regular set
in the argument of $p(\cdot)$, but the meaning is clear.) It is more helpful
to look at $M_N$ as described by its transition tree graph given below.

[Figure: transition tree graph of $M_N$, labeled $T_0$]
We have labeled the entire tree as $T_0$ and the $N-1$ sub-trees as shown.
(Henceforth we will omit the machine or regular set subscript on $\mu$ since
it is clear what machine we are talking about.) For $n < N-1$, $\mu(n) = 2^n$
since all $\alpha \in \{1,2\}^*$ of length less than $N-1$ are accepted. For
$n = N-1$, $\mu(n) = 2^n - 1$ since only $\alpha = 1^{N-1}$ is not accepted. Now for
$n > N-1$, since the $T_i$, $i = 1, 2, \ldots, N-1$, are mutually disjoint,

$$\mu(n) = \sum_{i=1}^{N-1} \mu_{T_i}(n)$$

where $\mu_{T_i}(n)$ is the number of strings of length $n$ accepted counting
the length from $T_0$ as a base. Furthermore, since each $T_i$, taken
separately, is identical to $T_0$, each $T_i$ alone recognizes the same set as
$T_0$, which means that $\mu_{T_i}(n) = \mu(n-i)$ and that

$$\mu(n) = \sum_{i=1}^{N-1} \mu(n-i)$$

which we recognize as an $N-1$ term Fibonacci sequence. For $N = 3$ we get
the familiar two term Fibonacci sequence
$\mu(0) = 1$
$\mu(1) = 2$
$\mu(2) = 3$
$\mu(3) = 5$
$\mu(4) = 8$
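The counts for the Fibonacci machine can be reproduced by simulating the machine on all tapes at once. This sketch assumes the transition function as read here (symbol 1 advances $q_i$ to $q_{i+1}$, symbol 2 resets to $q_0$, and $q_{N-1}$ is the dead state); the function name is hypothetical.

```python
from typing import List


def fibonacci_machine_counts(n_states: int, max_len: int) -> List[int]:
    """Return [mu(0), ..., mu(max_len)] for the N-state Fibonacci machine."""
    # counts[i] = number of tapes of the current length that leave the machine in q_i
    counts = [0] * n_states
    counts[0] = 1  # only the empty tape, which stays in q_0
    mu = []
    for _ in range(max_len + 1):
        mu.append(sum(counts[: n_states - 1]))  # accepting states q_0 .. q_{N-2}
        nxt = [0] * n_states
        for i in range(n_states - 1):
            nxt[0] += counts[i]      # symbol 2 resets to q_0
            nxt[i + 1] += counts[i]  # symbol 1 advances to q_{i+1}
        nxt[n_states - 1] += 2 * counts[n_states - 1]  # dead state loops on both symbols
        counts = nxt
    return mu
```

For `n_states=3` the first values are 1, 2, 3, 5, 8, and for any number of states the values satisfy the $N-1$ term recurrence from length $N-1$ onward.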



Because of the type of growth of $\mu_{M_N}(n)$ we call $M_N$ the N state
Fibonacci machine. Since we have a Fibonacci sequence, from Alfred [1],
we may calculate

$$\lim_{n \to \infty} \frac{\mu_{M_N}(n+1)}{\mu_{M_N}(n)}$$

by solving the polynomial

$$P_N(x) = x^{N-1} - x^{N-2} - \cdots - x - 1 = 0$$

$P_N(x)$ is continuous and $P_N(0) = -1$, $P_N(2) = 1$, which means that
there exists a solution of $P_N(x) = 0$ that is greater than zero and less
than two. Furthermore, since for $x > 2$,



i 



> 



1 

z-i 





> 



% l 



! 



£ 






we have 



Z 



N-l 




+ %“' 1 +■- + % -t 1 



N-l .. A/-1 

=> -+Nl + 






=> Nl' JI >(N-i) Z N ' 3 -t (N-Z)r? % ■ ■ ■ + 2% f 1 
A/-l) % N ' 3 - (A/-j) l N ' 3 - -3Z-1 >0 

=$> P^(z)>o 



Therefore all solutions of $P_N(x) = 0$ are strictly less than two, which



means that 1 / \ 

a M n Wy < 1 

n ^ M A M Jn) 

Hence for any k, 

£tyyr'U ^ _ fQjrrtj f /ifi^ (zi^l) \ 

^ ’***{]»> JntH ) " J M JnT ) 

_ jCtwlj (izitL (***) ...Uv LJm) . A 

~ AmJh) w 
With the help of this definition and discussion of the N state Fibonacci
machine we can prove

Theorem 6.1 For every $\varepsilon > 0$, there exists a finite automaton
(and hence a regular set $A$) such that

$$\lim_{n \to \infty} \frac{\mu_A(n+1)}{\mu_A(n)} = \theta$$

where $2 - \varepsilon < \theta < 2$.

Proofs Since for every € > 0 

%(-h) k - lk --L- 
l.\s-£) - — — -j-e 



A U 

k--i J ■?-« 

choose N sufficiently large so that 









> AM* > 7 

L\i-d 



H 



Form the Fibonacci machine for this N. From the preceding discussion, 

jfavnj (Ml) n 

namely the fact that Jn) ^ G\ we have, for suf- 
ficiently large n 9 (W ^ i and hence that 

A»*(n) Q * . secondly, 



a* 



l(hf >1 ^ C?-ef +---(i-c>i >(a-e) 



N 



hi 



=?(a-ef -(a-cf 1 - (a-e)-i < 0 

=> P N (a-c) < 0 



Since P^(2) = 1 > C for any N (and P (x) is continuous) 

/~) _ h*y\J > D - p 

61 = '•*«» ,u(«) ' * c 
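The behaviour claimed in theorem 6.1 can be observed numerically. The sketch assumes the polynomial has the form $P_N(x) = x^{N-1} - x^{N-2} - \cdots - x - 1$ (the characteristic polynomial of the $N-1$ term recurrence) and locates its root between 1 and 2 by bisection; the function names and the tolerance are illustrative choices, and $N \ge 3$ is assumed so that $P_N(1) < 0$.

```python
def p_n(x: float, n_states: int) -> float:
    """P_N(x) = x^(N-1) - x^(N-2) - ... - x - 1."""
    return x ** (n_states - 1) - sum(x ** i for i in range(n_states - 1))


def growth_rate(n_states: int, tol: float = 1e-12) -> float:
    """Root of P_N in (1, 2) by bisection, assuming n_states >= 3."""
    lo, hi = 1.0, 2.0  # P_N(1) = 2 - N < 0 and P_N(2) = 1 > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p_n(mid, n_states) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For three states the root is the golden ratio, about 1.618; as the number of states grows the root increases toward two without reaching it, so for any $\varepsilon > 0$ a large enough $N$ puts it inside $(2 - \varepsilon, 2)$.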
The Fibonacci machine is especially interesting since it seems that

) / ) 

for any other machine, M. , with N states and such that — j _ ^ Q s 

^ A ^ r\ 

for all n. Although not rigorously proved 



here, this conjecture lends itself to an induction proof on n that would 
proceed first from the fact that for n ^ N-l the conjecture is obviously 



true 



( /!*„< n) - 2°, n < N-l 



and 



/)rt V ( n ) = 2 n -l for n = N-l) 



since the dead state must be accessible in M , the other machine, in 

n 

N-l or less steps, and then an argument showing successively how must 
have the minimal number of elements in state q^ T (q^ ^ is the dead 
state), q^ at each level less than n, and hence the maximal 

number of elements in state at each level less than or equal to n. 



thus 



implying that at level n + 1, . Intuitively 



speaking this conjecture is not hard to accept since in there is one 
and only one way to get to the dead state and this path is as long as it 
can possibly be. Furthermore, all other states besides the dead state 
are accept states. 

With this conjecture in mind, the Fibonacci machine gives a growth
rate upper bound for machines which have only one absorbing state, namely
the dead state. This growth rate upper bound is seen in Theorem 6.1 to
be arbitrarily close to two, but always less than two. That is to say,
for any machine M which has only the dead state as an absorbing state

$$\lim_{n \to \infty} \frac{n_M(n+1)}{n_M(n)} < 2.$$
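A minimal sketch of the counts behind this bound follows, under a clearly stated assumption: the definition of the N state Fibonacci machine is not repeated in this section, so the code assumes a binary alphabet, N - 1 accepting states that record the length of the current run of 1's, and a dead state entered after N - 1 consecutive 1's. This assumed machine agrees with the values quoted above ($2^n$ accepted strings for $n < N-1$ and $2^n - 1$ for $n = N-1$), but it may differ in detail from the machine defined earlier in the text. The name `accepted_counts` is illustrative.

```python
def accepted_counts(N, max_len):
    """Return [n(0), ..., n(max_len)] for the assumed N state machine (N >= 2).

    Assumption: live states 0..N-2 hold the current run length of 1's;
    reading a 1 in state N-2 leads to the dead (absorbing, rejecting) state.
    """
    state = [0] * (N - 1)
    state[0] = 1                      # the empty string is in the start state
    counts = [sum(state)]
    for _ in range(max_len):
        nxt = [0] * (N - 1)
        nxt[0] = sum(state)           # reading 0 resets the run from any live state
        for i in range(N - 2):
            nxt[i + 1] = state[i]     # reading 1 lengthens the run by one
        # strings in state N-2 that read a 1 are lost to the dead state
        state = nxt
        counts.append(sum(state))
    return counts
```

With N = 3 the counts are the Fibonacci numbers 1, 2, 3, 5, 8, ..., and for every N the count at level n + 1 falls short of twice the count at level n as soon as the dead state becomes reachable.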
VII. THE SET OF TREES AS AN INPUT



For all automata previously discussed, the input has been a one
dimensional tape, i.e., a string of symbols. A tree automaton is
essentially the same as an ordinary finite automaton except for the input
which is, as the name implies, the set of well formed trees. In order
to describe this set we need the following definitions.



Definition 7.1 A ranked alphabet is a pair $\langle A, \sigma \rangle$ where A is a
finite set of symbols and $\sigma$ is a mapping from A into the set of
non-negative integers.

If we further define

$$A_0 = \{a \in A : \sigma(a) = 0\}$$
$$A_1 = \{a \in A : \sigma(a) = 1\}$$
$$\vdots$$
$$A_k = \{a \in A : \sigma(a) = k\}$$

then $A = A_0 \cup A_1 \cup \cdots \cup A_k$.

Definition 7.2 $\alpha$ is a tree iff $\alpha \in A_0$ or $\alpha = a(\alpha_1, \ldots, \alpha_n)$
where $a \in A_n$ and $\alpha_1, \ldots, \alpha_n$ are trees.

The set of all well-formed trees is a regular set in the sense
that it can be recognized by a tree automaton. The question here is:
Does there exist a one-one mapping from the set of well-formed trees to the
set $A^*$ such that the number of times an element of A appears in the
tree is the same as the number of times that that particular element
appears in the string which is its image and such that the image of the
set of trees is a regular set in the sense of definition 2.7? For
simplicity, and again without loss of generality, we consider the case where
$A = \{+, x\}$ and $\sigma(+) = 2$, $\sigma(x) = 0$. The condition imposed upon
the one-one mapping reduces to saying that if a tree has k +'s and (k + 1) x's
(all trees must have one more x than +), then the string which
represents that tree must also have k +'s and k + 1 x's. An example of a tree
for this A is

[Figure: an example tree over A, with + at the internal nodes and x at the leaves.]
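To make the counting condition concrete, the following Python sketch enumerates the trees over $A = \{+, x\}$ and writes each one as a string in prefix (Polish) notation. Prefix notation is an illustrative choice made here, not a mapping taken from the text; it is one-one and keeps the number of +'s and x's, so it satisfies every requirement except, possibly, regularity of the image. The representation of a tree as the string `'x'` or a tuple `('+', left, right)` is likewise an assumption of this sketch.

```python
from itertools import product


def trees(k):
    """All trees with exactly k '+' nodes; a tree is 'x' or ('+', left, right)."""
    if k == 0:
        return ['x']
    out = []
    for i in range(k):
        # i '+' nodes go to the left subtree, k - 1 - i to the right subtree
        for left, right in product(trees(i), trees(k - 1 - i)):
            out.append(('+', left, right))
    return out


def prefix(t):
    """Prefix (Polish) string of a tree: one character per node."""
    if t == 'x':
        return 'x'
    return '+' + prefix(t[1]) + prefix(t[2])
```

Every tree with k +'s is sent to a distinct string of length 2k + 1 containing k +'s and k + 1 x's, which is exactly the condition imposed on the mapping in this section.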



Let Y be the set of trees over A. Assume $\varphi$ is a mapping which
satisfies the above conditions, i.e., $\varphi : Y \to T(M) = \varphi(Y)$
for some M (we assume M to be minimal) and the strings in $\varphi(Y)$ have
the correct number of appearances of each element in them. From Knuth
[9] we know that the number of trees with n +'s and n + 1 x's is

$$C_n = \frac{1}{n+1} \binom{2n}{n}.$$

Let $B = \varphi(Y)$. The restrictions on $\varphi$ give us
$n_B(2n) = 0$ and $n_B(2n+1) = C_n$ for all n. Using Stirling's
approximation to the factorial, i.e., $n! \approx \sqrt{2 \pi n}\,(n/e)^n$, we find that

$$C_n \approx \frac{4^n}{(n+1)\sqrt{\pi n}}.$$



Then

$$\lim_{n \to \infty} \frac{n_B(2n)}{2^{2n}} = 0$$

and

$$\lim_{n \to \infty} \frac{n_B(2n+1)}{2^{2n+1}} = \lim_{n \to \infty} \frac{C_n}{2^{2n+1}} = 0.$$

Therefore,

$$\lim_{n \to \infty} \frac{n_B(n)}{2^n} = 0$$

and so $p(B) = 0$. M must then be a machine with only one absorbing state,
namely the dead state. But if we look at the sequence

$$\frac{n_B(2n+1)}{n_B(2n-1)} = \frac{C_n}{C_{n-1}} = \frac{2(2n-1)}{n+1}$$

we see that it converges to $4 = 2^2$, so that the growth rate is too large
for a machine of this type. Hence, a contradiction, and so no such
mapping exists.
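The two facts used in this argument, the Stirling estimate for $C_n$ and the limit of the two-step ratio, can be checked with a short Python sketch. The function names are illustrative; `math.comb` requires Python 3.8 or later.

```python
from math import comb, pi, sqrt


def catalan(n):
    """C_n = binom(2n, n) / (n + 1): trees with n +'s and n + 1 x's."""
    return comb(2 * n, n) // (n + 1)


def stirling_catalan(n):
    """Stirling-based estimate 4^n / ((n + 1) * sqrt(pi * n)), for n >= 1."""
    return 4.0**n / ((n + 1) * sqrt(pi * n))


def two_step_ratio(n):
    """C_n / C_(n-1), which equals n_B(2n+1) / n_B(2n-1) for the set B."""
    return catalan(n) / catalan(n - 1)
```

Each ratio equals 2(2n - 1)/(n + 1), which is below 4 for every n but tends to 4. It is this limit, equal to $2^2$, that a machine whose only absorbing state is the dead state cannot attain.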
VIII. SUMMARY AND CONCLUSIONS



Looking back now on the problem of section 7, we can see why
something more than the criteria given in section 3 was needed to solve it.
First of all, so little was known about how the supposedly regular set
B behaved that it would have been difficult to decide even if
$\lim_{n \to \infty} n_B(n)/2^n$ existed. Secondly, although theorem 3.7 could be
applied, it was not a sharp enough criterion to give any useful
information. Essentially all that was known was that $n_B(2n) = 0$ and
$n_B(2n+1) = C_n$. However, with this new concept of p(B), and knowing
theorems 4.4 and 5.2, we could deduce information about what type of
machine would be needed to recognize B, namely a machine with a single
absorbing state, the dead state. Next, with the Fibonacci machine
providing a growth rate upper bound for this particular type of machine
we could show that $n_B(n)$ grew too fast for this type of machine and
hence B could not be regular. This problem was especially difficult,
since the growth rate of $n_B(n)$ was just barely greater than allowable
and a really fine line had to be drawn to mark the cutoff point.
Therefore, p(·) is an improvement over $\lim_{n \to \infty} n_A(n)/k^n$ in that it
is a more specific indicator giving more information about the machine,
namely not only the existence of a dead state, but that the dead state
is the only absorbing state. Coupling this with the "maximality" of the
Fibonacci machine yields another criterion for showing a set to be
non-regular.